Interpretation of DNA data within the context of UK forensic science — evaluation

Forensic DNA provides a striking contribution to the provision of justice worldwide. It has proven to be crucial in the investigative phase of an unsolved crime where a suspect needs to be identified, e.g. from a DNA database search both nationally and internationally. It is also a powerful tool in the assignment of evidential weight to the comparison of a profile of a person of interest and a crime scene profile. The focus of this document is the evaluation of autosomal profiles for criminal trials in the UK. A separate review covers investigation and evaluation of Y-STR profiles, investigation using autosomal profiles, kinship analysis, body identification and Forensic Genetic Genealogy investigations. In less than 40 years, forensic DNA profiling has developed from a specialist technique to everyday use. Building on advances in genome typing technology, forensic DNA profiling has experienced a substantial increase in its sensitivity and informativeness. Alongside this development, novel interpretation methodologies have also been introduced. This document describes the state of the art and future advances in the interpretation of forensic DNA data.


Introduction
Forensic DNA is now a cornerstone of forensic science. Since its introduction into casework in the 1980s [1], it has benefited from a scientific basis in genetics for its production, and a probabilistic basis for its reporting. This document presents the state of the art in the UK and the future for the interpretation of DNA data within the context of forensic cases. The topic is vast, and the review has therefore been divided into two parts. Here the focus is on the evaluation of autosomal DNA, i.e. assigning evidential value to a DNA profile from a crime scene and a profile from a person of interest (POI). A separate review covers the provision of intelligence for investigation (identifying suspects), evaluation of Y-STR profiles, kinship analysis, body identification and Forensic Genetic Genealogy investigations.
DNA data consists of (a) information provided in the production of a DNA profile, e.g. DNA quantity and percentage of contribution by donors; (b) contextual information of the case; and (c) external databases such as national DNA databases for investigation or allele frequency databases for assigning evidential weight.
Evaluation of DNA data provides powerful evidence to help address questions of the origin of the DNA in the sample and assists with the mechanism for deposition of DNA on an item. Interpretation of DNA data for evaluation is the assignment of probative value to DNA findings, consisting of DNA profiles from a POI and from a crime scene, when considering the findings under the views of the prosecution and the defence in a criminal case.
These findings are put into the context of the particular case circumstances so that the legal practitioners and jury can give this information the appropriate level of importance in their deliberations. The evaluation is performed by applying the case assessment and interpretation (CAI) methodology. A central component of CAI is the use of Bayes' theorem, which has at its core the likelihood ratio (LR) as a way of assigning evidential weight to DNA findings.

Provision of Forensic Services in the UK
There are three jurisdictions in the UK: England and Wales, Scotland and Northern Ireland. The forensic services are provided by police laboratories, private forensic providers, specialist companies and Universities, with quality being overseen by the Forensic Science Regulator of England and Wales, and procedures being accredited by the United Kingdom Accreditation Service (UKAS).
From a crime scene to a DNA profile
Some regions of DNA contain information that codes for the production of the proteins that carry out functions within cells. Some DNA does not code for any protein and since the selection pressure on this is less than for the genes encoding proteins, more mutations can accumulate in these regions. In current forensic DNA analysis this variation in non-coding DNA is used to distinguish between individuals. The region consists of a short sequence of, usually, three to four nucleotides which are repeated a number of times, called Short Tandem Repeats (STR). The number of copies of repeated DNA is defined as an allele, e.g. allele 13 at DNA region (marker or locus) D18S51 has 13 copies of sequence AGAA and can be written as [AGAA]13. Figure 1 shows a diagrammatic description of the process of producing a DNA profile. DNA is extracted from items or samples and the amount recovered is quantified. The DNA is then divided into parts (aliquots) and amplified using polymerase chain reaction (PCR) to increase the amount of DNA to reach detectable levels. Several regions, usually called loci or markers, are tested.
The amplified DNA is then separated by size using capillary electrophoresis to produce an electropherogram (epg), which consists of peaks where the x-axis location is the length of DNA expressed as the number of STR repeats within each specific region, and peak heights in the y-axis are roughly proportional to DNA quantity. Figure 2 shows an epg of an ESI profile, described in Table 1. Software is then used to designate allelic peaks and artefacts. The whole process from extraction to designation is largely automated apart from specialised extraction methods, e.g. those required to extract DNA from difficult samples like bone.
A reference profile is produced from a sample taken from a known person under controlled conditions and with an optimal amount of DNA. Therefore, the designated profile, or genotype, provides all the information needed, see bottom of Figure 1. In contrast, an epg from a sample of questioned origin contains information, via peak heights, regarding several sources of uncertainty, e.g. imbalance between partner peaks, number of contributors, non-detected alleles (dropout), spurious peaks (dropin) and amount of DNA attributed to the contributors. Peak heights of alleles and some artefacts are used in the evaluation of data given hypotheses in relation to the origin of the DNA (see section titled "Sub-source").
A DNA profile consists of a group of locus-specific profiles. The sets of combined loci are often called multiplexes. A variety of multiplexes are in use across the world, with an overlap between the DNA loci included in different multiplexes (Table 1). New multiplexes are backward compatible with older multiplexes, e.g. SGMPlus, and cross compatible for international database searching, e.g. CODIS expanded. For an in-depth treatment of this topic see [2]. Currently in the UK, NGMSE, ESI and DNA23+ are used routinely.
DNA analysis is undergoing major changes as a spin-off from methods developed by the human genome project. As a consequence, the DNA sequence of forensically important regions will be recorded rather than simply measuring the number of STRs (see section titled "Sub-source"). The new technology called Massively Parallel Sequencing (MPS) will also include regions for forensic DNA phenotyping, i.e. incorporating regions of DNA that are informative about the physical features of the donor of a profile [3].

Interpretation foundation
Interpretation of DNA data in the UK and some parts of Europe follows the principles described in the CAI model [4][5][6][7][8]. The European Network of Forensic Science Institutes (ENFSI) supports this approach and provides guidelines for its application [9]. The Royal Statistical Society provides a set of four documents for guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses. One of them covers the CAI approach [10]. Recently, the forensic science regulator of England and Wales has published guidelines for the application of CAI across evidence types [11].
CAI promotes pre-assessment of potential findings as a way of mitigating cognitive bias [12]. CAI can be applied to an investigation, when a suspect needs to be identified based on a crime scene profile, and evaluation, when a suspect has been identified and the interest lies in how to assign the weight of evidence applied to the scientific findings.
A central concept for evaluation in CAI is that of propositions. In general, propositions can be classified as addressing the origin of the DNA (sub-source level [13], e.g. whether the DNA came from the POI), source of material (e.g. whether the semen came from POI), the activity (e.g. whether the POI had intercourse with the victim) and the offence level (e.g. whether the POI raped the victim). From this description it is clear that offence level is not in the remit of the forensic scientist.
Within CAI at least two mutually exclusive propositions must be considered. If the case circumstances, denoted here by I, do not provide information to formulate such propositions, it is not possible to evaluate the DNA findings. In criminal cases one proposition represents the view of the prosecution, H1, and the other that of the defence, H2 (see next section for some examples).
The logical approach for assigning a weight to DNA findings, denoted here by E, is based on Bayes' theorem in its odds form [14]:
prior odds × likelihood ratio = posterior odds
That is, the posterior odds are calculated as the prior odds multiplied by the LR. In the UK, for DNA at sub-source level, the jury uses its common sense to assign the prior odds and combine them with the LR, provided by the forensic scientist, to obtain the posterior odds, as defined in R v Doheny & Adams [15].
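The arithmetic of the odds form can be sketched in a few lines of Python. This is an illustration only: the prior odds and LR below are hypothetical values chosen for the example, and the function names are not taken from any forensic software.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of H1 into a probability of H1."""
    return odds / (1 + odds)


# Hypothetical values for illustration only.
prior = Fraction(1, 1000)  # prior odds, assigned by the fact finder (assumed)
lr = 1_000_000             # LR provided by the forensic scientist (assumed)

post = posterior_odds(prior, lr)
print(post)                              # posterior odds in favour of H1
print(float(odds_to_probability(post)))  # the same result as a probability
```

The sketch keeps the two inputs separate on purpose: the scientist supplies only the LR, while the prior odds belong to the court.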

Autosomal profile evaluation
At this stage there is a POI whose DNA profile has been produced from a reference sample taken from him/her. The task then is to evaluate whose DNA was present in the questioned sample and how it got there.

Sub-source
The questioned sample may have come from one or more people. The complexities of calculating LRs, considering all the sources of uncertainty, are similar in either case. Therefore, in this section, we will consider a two-person mixture because it is rich enough to exemplify the features of mixture evaluation.
The court is interested in the origin of the DNA, which is addressed with a pair of propositions of the kind
H1: The DNA came from a POI and an unknown person unrelated to the POI
H2: The DNA came from two unknown people unrelated to each other and to the POI
An important aspect of the propositions is that they are posited based on the case circumstances and can be refined in the light of the DNA findings. The evaluation can be adapted if a different alternative proposition is put forward by the defence. The reappraisal of the propositions is ideally performed prior to the trial because it is time intensive and this allows the full process of statement writing, including peer review, to take place.
An LR considering a pair of propositions can be calculated using a probabilistic genotyping system (PGS), described below. For example, consider a case where the LR is greater than a billion. This is reported in a statement for court as: 'The DNA findings are at least a billion times more likely if H1 is true rather than H2 is true.' Following FSR guidelines, LRs that are greater than one billion are reported in the UK as 'at least one billion' [16]. There is a verbal scale [9,17] which is not intended for LRs calculated at the DNA sub-source level but for situations where no quantitative evaluation is possible. The International Society for Forensic Genetics (ISFG) has also produced guidelines [18].
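The reporting convention described above can be sketched as a small helper. Only the cap at one billion and the sentence pattern come from the text; the function names, the rounding applied below the cap and the restriction to LRs greater than one are assumptions made for this illustration.

```python
REPORTING_CAP = 1_000_000_000  # one billion, the reporting threshold described in the text


def reported_value(lr):
    """Return the LR as it would be quoted, applying the 'at least one billion' cap."""
    if lr > REPORTING_CAP:
        return "at least one billion"
    # Assumption: below the cap, quote the value rounded to a whole number.
    return f"{lr:,.0f}"


def statement(lr, h1="H1", h2="H2"):
    """Build a sentence of the kind quoted in the text (sketch covers LR > 1 only)."""
    if lr <= 1:
        raise ValueError("this sketch only covers LRs greater than one")
    return (f"The DNA findings are {reported_value(lr)} times more likely "
            f"if {h1} is true rather than {h2} is true.")


print(statement(5e9))        # capped wording
print(statement(2_500_000))  # assumed wording below the cap
```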
A PGS combines the quality of the questioned profile and the rarity of the genotype pair combinations. The questioned profile consists of peak heights for each component detected. Weights, or probability densities, are assigned to the questioned profile on the assumption that it came from one putative genotype combination at a time, out of a large number of combinations. The rarity of a genotype is assigned through a genotype probability, which requires a database of allele proportions published by the FSR of England and Wales [19], and stratified according to self-declared ethnicity [20]. It also requires allowances for both (a) the size of the allele proportion database and (b) subpopulation structure (shared distant ancestry) via a parameter usually called θ or FST [21].
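To illustrate how θ enters a genotype probability, the sketch below uses the Balding-Nichols conditional genotype probability for the simplest single-source case, i.e. the probability that an unknown person has a genotype given that the same genotype has already been observed once. This formula is swapped in for illustration and is not claimed to be what any UK system implements. The allele proportions are hypothetical, the function names are invented for the example, and the allowance for database size mentioned in (a) is not modelled.

```python
import math


def conditional_genotype_probability(genotype, proportions, theta=0.0):
    """Balding-Nichols probability of `genotype` given one prior observation of it.

    genotype    -- tuple of two allele designations, e.g. ("13", "16")
    proportions -- dict mapping allele designation to its population proportion
    theta       -- subpopulation correction (theta = 0 gives Hardy-Weinberg values)
    """
    a, b = genotype
    denom = (1 + theta) * (1 + 2 * theta)
    if a == b:  # homozygote
        p = proportions[a]
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    pa, pb = proportions[a], proportions[b]  # heterozygote
    return 2 * (theta + (1 - theta) * pa) * (theta + (1 - theta) * pb) / denom


def profile_probability(profile, database, theta=0.0):
    """Multiply per-locus probabilities, assuming independence between loci."""
    return math.prod(
        conditional_genotype_probability(g, database[locus], theta)
        for locus, g in profile.items()
    )


# Hypothetical allele proportions and profile, for illustration only.
database = {"D18S51": {"13": 0.12, "16": 0.14}, "TH01": {"9.3": 0.30}}
profile = {"D18S51": ("13", "16"), "TH01": ("9.3", "9.3")}

p = profile_probability(profile, database, theta=0.03)
print(f"profile probability: {p:.3e}")
```

With θ = 0 the function reduces to the familiar p² and 2pq values; a positive θ raises the probability of rare genotypes and is therefore conservative towards the POI.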
Currently in the UK, there are two validated systems in routine use: a commercially available software, STRmix [22], and a proprietary system, LiRa [23]. Other systems are sometimes used, such as the freeware DNAmixtures [24], EuroForMix [25] and likeLTD [26] and the commercial software TrueAllele [27]. On rare occasions, such as for older unsolved (cold) cases, a discrete statistical model may be used for calculating an LR using LiRa discrete [28] or likeLTD discrete [29]. For a historic overview of PGS development see [30]. Guidelines for PGS validation have been produced in the UK [31], in Europe [32] and in the USA [33].
The calculation of an LR using a PGS requires entering the number of contributors, which is selected by the user. STRmix allows a range of values to be entered [34]. One approach is to estimate a probability distribution on the number of contributors using an external program [35,36] and make a decision on the value or values entered into the PGS [37]. Currently, PGSs can routinely perform LR calculations for mixtures of up to four people; some can also perform calculations for five- and six-person mixtures.
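As a point of reference for the number of contributors, the sketch below uses the simple maximum allele count rule rather than the probabilistic estimation cited above: each person carries at most two alleles per autosomal locus, so the locus with the most distinct alleles sets a lower bound. The rule ignores peak heights, dropout, dropin and artefacts, and the profile shown is hypothetical.

```python
import math


def min_contributors(profile):
    """Lower bound on the number of contributors by maximum allele count.

    profile -- dict mapping locus name to the allele designations detected there
    """
    if not profile:
        raise ValueError("profile must contain at least one locus")
    max_alleles = max(len(set(alleles)) for alleles in profile.values())
    # Each contributor carries at most two alleles per autosomal locus.
    return max(1, math.ceil(max_alleles / 2))


# Hypothetical designated alleles, for illustration only.
questioned = {
    "D18S51": ["13", "14", "16", "17"],
    "vWA": ["15", "16", "17"],
    "TH01": ["6", "9.3"],
}
print(min_contributors(questioned))
```

Because contributors can share alleles and alleles can drop out, the true number may exceed this bound, which is one reason the text describes probabilistic estimation and user judgement instead.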
For a more in-depth description of the evaluation of autosomal profiles at sub-source level see [38][39][40].
In the future, for the specific task of addressing sub-source level propositions, STR regions will be typed down to base pairs, in addition to determining the number of repeats. Therefore, a single STR allele, e.g. allele 16, may have a number of variants resulting from differences in the base pair sequences. For example, allele 13 at DNA region D18S51 has at least two variants, variant 1301 has sequence [AGAA]13 and variant 1302 has sequence AGAA AGCA [AGAA]11 [41]. In 1301 the expression [AGAA]13 means that bases AGAA repeat 13 times, while in 1302 the second repeat is AGCA instead of AGAA followed by 11 repeats of AGAA.
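The bracketed notation used above can be expanded mechanically, which makes the difference between the two variants explicit. The sketch below is illustrative only: the parser and its function name are not part of any agreed nomenclature, and it assumes that each unbracketed block in the notation is a single repeat unit.

```python
import re

# A block is either a bracketed unit with a count, e.g. [AGAA]11, or a bare unit, e.g. AGCA.
BLOCK = re.compile(r"\[([ACGT]+)\](\d+)|([ACGT]+)")


def expand(notation):
    """Expand bracketed repeat notation into a list of repeat units."""
    units = []
    for block in notation.split():
        match = BLOCK.fullmatch(block)
        if match is None:
            raise ValueError(f"unrecognised block: {block!r}")
        if match.group(1):  # bracketed unit repeated n times
            units.extend([match.group(1)] * int(match.group(2)))
        else:               # assumption: a bare block is one repeat unit
            units.append(match.group(3))
    return units


variant_1301 = expand("[AGAA]13")
variant_1302 = expand("AGAA AGCA [AGAA]11")

print(len(variant_1301), len(variant_1302))  # both are allele 13 by repeat count
print(variant_1301 == variant_1302)          # but the sequences differ
```

Length-based typing sees only the repeat count, so both variants are designated allele 13; sequencing distinguishes them by the AGCA unit in the second position.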
There are several challenges for bringing MPS into practice. National databases will need to be updated to accommodate the additional variants, which requires careful consideration of internationally agreed nomenclature [42]. For the calculation of LRs, a database of allele counts, including the new variants, needs to be collated for each ethnic appearance population within countries. Some work has been carried out already towards this goal [41]. A third challenge is to perform research on the behaviour of peak heights and artefacts when considering allele variants. Some research has been published to cover these aspects [42,43]. The first case using MPS has been reported by Peter de Knijff from Leiden University (Pers. Comm. P. de Knijff).

Source
Sub-source level considers the origin of the DNA result while source adds the attribution of the DNA to a body fluid. Currently in the UK, subjective opinion of attributions is reported based on presumptive chemical tests, microscopy or visual appearance. Although there is an implicit pair of propositions, it is often expressed as, e.g. 'In my opinion, the male DNA component of the result can be attributed to semen'. A review of body fluid identification methods is given in [44]. DNA and RNA based methods are being developed. Some of these are in current use for casework at the Netherlands Forensic Institute (NFI) [45].
In the near future, LRs will be calculated for source level propositions at the NFI [46]. Incorporation of DNA and RNA based tests within MPS opens the possibility of calculating LRs that incorporate source and sub-source evaluation together.

Activity
An important question for the court is the activity by which and the time when a DNA stain was left on an item of interest. At present in the UK, a forensic scientist reports these as a subjective opinion, e.g. 'Based on my experience, the DNA results are what I would expect if the suspect had penetrated the victim. If an alternative version is provided, then I will reappraise my opinion'. Sometimes, a subjective opinion is aided by propositions, e.g.
H1: The suspect penetrated the victim
H2: The suspect held hands with the victim
and reported as 'The DNA findings provide support for H1 rather than H2'.
The opinions are based on a number of studies addressing activities for deposition of DNA which consider Transfer, Persistence, Prevalence and Recovery (TPPR) [47]. Transfer refers to the direct, or primary, and indirect, or secondary, deposition of DNA to items. Tertiary deposition may also be considered. Persistence refers to the retention and loss of DNA through other activities such as movement or washing. Prevalence refers to background DNA, i.e. the preexisting DNA unrelated to the incident which is found on surfaces. Recovery refers to locating the stain to be sampled and the efficiency of extracting DNA for analysis.
Publications on TPPR vary widely in terms of multiplexes, testing methods and the factors that affect TPPR, leaving forensic scientists with the challenging task of making sense of the literature and applying it to their case. Publications are sometimes gathered into databases, e.g. DNATrAC [48], to make the literature more accessible.
There is research and development exploring the use of Bayesian networks (BNs) for the evaluation of activity level propositions, see [49] for a review and [50] for some examples. To date, BNs have been used only to underpin an opinion which is then reported in court. It has taken some years from the first uses of BNs in forensic science [51,52] to reach a point where they are ready to support casework.
The Netherlands Register of Court Experts (Nederlands Register Gerechtelijk Deskundigen, NRGD) has recently introduced guidelines for registering forensic scientists who report activity level propositions to court [53]. The ISFG has also produced guidelines [54]. The FSR has published generic guidelines that are applicable to evaluative opinions that include activity level propositions [11]. A review of transfer research is given in [55]. Limitations of expert opinions are discussed in [56].
We envisage that in the future, probabilistic support systems (PSSs) will be developed to assist the forensic scientist to report activity level propositions. These will be underpinned by statistical models informed by data that can also incorporate expert knowledge. A PSS would be used for training, calibration of expert opinions and reporting. The statistical models and expert opinion would inform nodes in BNs, which would be an intrinsic part of a PSS.
The development of these systems would require fundamental experiments and observational studies on TPPR, performed following a systematic and structured methodology that will allow the data produced to be used by several organisations. Past initiatives have proposed these ideas [57].

Summary
• In forensic science, DNA data is interpreted to provide assistance to criminal courts in determining whether the DNA of a person of interest is present in a crime stain and how and when it was deposited.
• Case Assessment and Interpretation gives a structure that clarifies the interpretation requirements and mitigates cognitive bias.
• Probabilistic genotyping systems address whether the DNA originated from the person of interest.